A Hybrid Approach Based on Self-Organizing Neural Networks and the K-Nearest Neighbors Method to Study Molecular Similarity
Authors
Abstract
The “Molecular Similarity Principle” states that structurally similar molecules tend to have similar physicochemical and biological properties. The question then is how to define “structural similarity” algorithmically and confirm its usefulness. Similarity searching falls within this framework: it is a practical approach to identifying candidate molecules (potential drugs) in databases or virtual chemical libraries by comparing compounds pairwise. Many statistical models and learning tools have been developed to correlate the structure of molecules with their chemical, physical, or biological properties. The role of data mining in chemistry is to extract “hidden” information from a set of chemical data. Each molecule is represented by a high-dimensional vector of molecular descriptors, and a learning algorithm is then applied to these vectors. In this paper, the authors study molecular similarity using a hybrid approach based on self-organizing neural networks and the k-nearest neighbors (kNN) method.

DOI: 10.4018/ijcce.2011010106. International Journal of Chemoinformatics and Chemical Engineering, 1(1), 75-95, January-March 2011. Copyright © 2011, IGI Global.

... process of laboratory experimentation. This process, from hit to lead to marketable drug, typically takes 5-10 years. To identify new molecules likely to become medicines, pharmaceutical research increasingly relies on technologies that make it possible to synthesize a very large number of molecules simultaneously and to test their action on a given therapeutic target. These data can be used to build models that predict the properties of molecules not yet tested, or even not yet synthesized. Searching for molecular similarity is an intelligent way to design drugs.
Its use is based on the principle that structurally more similar molecules are more likely to exhibit similar properties than structurally less similar ones (Monev, 2004; Johnson & Maggiora, 1990). Such predictive models are very important because they make it possible to suggest the synthesis of new molecules and to eliminate, very early in the search process, molecules whose properties would prevent their use as medicines. This is called virtual screening. Hence, searching for functionally similar molecules, which is very important in drug design, can be accomplished by searching for structurally similar molecules (van de Waterbeemd & Gifford, 2003). The problem, however, is how to define molecular similarity.
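As an illustration of how a self-organizing map and kNN might be combined on descriptor vectors, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names (`train_som`, `best_matching_unit`, `som_knn_query`), the grid size, the linearly decaying learning rate and neighborhood width, the Euclidean distance, and the rule of restricting the kNN search to the query's map unit are all assumptions made for this example.

```python
import numpy as np

def train_som(data, rows=3, cols=3, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows*cols, dim)."""
    rng = np.random.default_rng(seed)
    # Grid coordinates of each unit, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialize unit weights from randomly chosen training vectors.
    weights = data[rng.integers(0, len(data), rows * cols)].astype(float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = sigma0 * (1.0 - frac) + 1e-3     # neighborhood width shrinks, stays > 0
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            grid_d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood on the grid
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights

def best_matching_unit(weights, x):
    """Index of the SOM unit whose weight vector is closest to x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

def som_knn_query(weights, data, query, k=3):
    """Indices of the k vectors in data nearest to query.

    The search is restricted to vectors mapped to the query's SOM unit;
    if that unit holds fewer than k vectors, it falls back to all data.
    """
    assignments = np.array([best_matching_unit(weights, x) for x in data])
    candidates = np.flatnonzero(assignments == best_matching_unit(weights, query))
    if len(candidates) < k:
        candidates = np.arange(len(data))
    dists = np.linalg.norm(data[candidates] - query, axis=1)
    return candidates[np.argsort(dists)[:k]]
```

In this sketch each row of `data` stands for one molecule's descriptor vector. The map acts as a coarse pre-clustering step, so the pairwise comparison in `som_knn_query` only touches molecules that fall in the same unit as the query; descriptors would normally be scaled to comparable ranges before training.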
Similar resources
A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors
Because cyberspace and the Internet dominate users' lives, they bring not only business opportunities and time savings but also threats such as information theft and system penetration, in both hardware and software. Security is the top priority in preventing cyber-attacks, and users should first detect the type of attack, because virtual environments are not moni...
Prototype Generation Using Self-Organizing Maps for Informativeness-Based Classifier
The k-nearest neighbor algorithm is one of the most important and simple procedures for data classification. The kNN, as it is called, requires only two parameters: the value of k and a similarity measure. However, the algorithm has some weaknesses that make it impractical for real problems. Since the algorithm has no model, an exhaustive comparison of the object in classification analysis...
A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Email is one of the fastest and most economical means of communication. The growing number of email users has led to an increase in spam in recent years. Spam not only costs users money, time, and bandwidth, but has also become a risk to the efficiency, reliability, and security of networks. Spam developers are always trying to find ways to escape the existing filters; therefore, new filters to de...
Comparison and evaluation of the performance of data-driven models for estimating suspended sediment downstream of Doroodzan Dam
Dams control most of the sediment entering the reservoir by creating static environments. However, sediment leaving the dam depends on various factors such as dam management method, inlet sediment, water height in the reservoir, the shape of the reservoir, and discharge flow. In this research, the amount of suspended sediment of Doroodzan Dam based on a statistical period of 25 years has been i...
Monthly runoff forecasting by means of artificial neural networks (ANNs)
Over the last decade or so, artificial neural networks (ANNs) have become one of the most promising tools for modelling hydrological processes such as rainfall runoff processes. However, the employment of a single model does not seem to be an appropriate approach for modelling such a complex, nonlinear, and discontinuous process that varies in space and time. For this reason, this study aims at de...
Journal title:
- IJCCE
Volume 1, Issue
Pages -
Publication date 2011